---
title: "Supplement for Decomposing Modal Thought"
author: "Jonathan Phillips & Angelika Kratzer"
date: "November, 2022"
output:
  pdf_document: default
  word_document: default
  html_document:
    df_print: paged
csl: apa.csl
bibliography: modality.bib
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = FALSE,dpi=300,fig.width=7)

#rm(list=ls())

blackGreyPalette <- c("#2C3539", "#999999") 
library(lme4)
library(lsr)
library(knitr)
library(tidyverse)

```


## Normality constraints

We experimentally investigated the defeasible normality constraints that play a role in restricting the modal domain, and in particular focused on their defeasibility.

### Methods

#### Participants

```{r normalityConstraints data}

d1 <- read.csv("data/normalityConstraints.csv",stringsAsFactors = F)

```

We collected a sample of `r length(unique(d1$ResponseId))` participants ($M_{age}$ = `r round(mean(d1$age,na.rm=T),digits=2)`; $SD_{age}$ = `r round(sd(d1$age,na.rm=T),digits=2)`; `r table(d1$gender)[[2]]` females) from Amazon Mechanical Turk ([www.mturk.com](https://www.mturk.com)).

#### Procedure

Participants were randomly assigned to one of two conditions. In one condition, participants were first presented with the following question: 

> A child was born two years ago. The child was born from its mother's first pregnancy and its mother died a year later before becoming pregnant again.
>
> Do you agree or disagree?
>
> __*The child must not have any siblings.*__

In the other condition, participants first read a slight variation in which the possibility of multiple births should be less likely to be excluded by implicit normality constraints:

> A dog was born two years ago. The dog was born from its mother's first pregnancy and its mother died a year later before becoming pregnant again.
>
> Do you agree or disagree?
>
> __*The dog must not have any siblings.*__

In both cases, participants rated their agreement on a scale from 0 ('Disagree') to 100 ('Agree').

Then the possibility of there being multiple births from a single pregnancy was explicitly raised. In the condition with a child, we told participants: 

> Now please carefully consider the following possibility.
> 
> __*The mother could have had twins, triplets, or even more babies at once.*__

and in the dog condition, we told participants:

> Now please carefully consider the following possibility.
>
> __*The mother could have had given birth to two, three, or even more puppies at once.*__

Participants were then asked to rate their agreement with the must claim again, on the same scale as before. 

Participants then answered a question asking them to write down the new possibility that had been raised, and reported whether they had considered that possibility before we explicitly pointed it out. Lastly, participants completed a brief demographic questionnaire.

### Results


```{r exp3 must, message=FALSE}

d1$time <- rowSums(d1[,c(21,26,31,36)],na.rm = T)
#hist(d1$time[d1$time<100], breaks=50,col="Red")

d1.l <-d1 %>% select(c(9,18,23,28,33,38:39,47)) %>%
              #filter(time > 20) %>%
              gather(question, must, 2:5, na.rm = TRUE) %>%
              mutate(question = factor(question),
                     condition = factor(c("Child","Child","Dog","Dog")[question]),
                     challenge = factor(rep(c("Before Challenge","After Challenge"),2)[question]),
                     challenge = factor(challenge, levels=c("Before Challenge","After Challenge"))
              )


lm1.0 <- lmer(must ~ condition * challenge + (1|ResponseId), data=d1.l, REML= F)

## interaction
lm1.1 <- lmer(must ~ condition + challenge + (1|ResponseId), data=d1.l, REML= F)
lm1.inter <- anova(lm1.0,lm1.1)

lm1.inter.df <- lm1.inter$Df[2]
lm1.inter.chi <- lm1.inter$Chisq[2]
lm1.inter.p <- lm1.inter$`Pr(>Chisq)`[2]


## main effect of condition
lm1.2 <- lmer(must ~ challenge + (1|ResponseId), data=d1.l, REML= F)
lm1.cond <- anova(lm1.1,lm1.2)

lm1.cond.df <- lm1.cond$Df[2]
lm1.cond.chi <- lm1.cond$Chisq[2]
lm1.cond.p <- lm1.cond$`Pr(>Chisq)`[2]

## main effect of challenge
lm1.3 <- lmer(must ~ condition + (1|ResponseId), data=d1.l, REML= F)
lm1.chal <- anova(lm1.1,lm1.3)

lm1.chal.df <- lm1.chal$Df[2]
lm1.chal.chi <- lm1.chal$Chisq[2]
lm1.chal.p <- lm1.chal$`Pr(>Chisq)`[2]


d1.sums <- d1.l %>% group_by(condition,challenge) %>%
                    summarize(n_must = length(must),
                                mean_must = mean(must, na.rm=TRUE),
                                sd_must = sd(must, na.rm=TRUE),
                                se_must   = sd_must/ sqrt(n_must)
                      )

#var.test(d1.l$must[d1.l$condition=="Child" & d1.l$challenge=="Before Challenge"],
#         d1.l$must[d1.l$condition=="Child" & d1.l$challenge=="After Challenge"])
d1.child.t <- t.test(d1.l$must[d1.l$condition=="Child" & d1.l$challenge=="Before Challenge"],
                    d1.l$must[d1.l$condition=="Child" & d1.l$challenge=="After Challenge"])
d1.child.d <- cohensD(d1.l$must[d1.l$condition=="Child" & d1.l$challenge=="Before Challenge"],
                      d1.l$must[d1.l$condition=="Child" & d1.l$challenge=="After Challenge"])

# var.test(d1.l$must[d1.l$condition=="Dog" & d1.l$challenge=="Before Challenge"],
#         d1.l$must[d1.l$condition=="Dog" & d1.l$challenge=="After Challenge"])
d1.dog.t <- t.test(d1.l$must[d1.l$condition=="Dog" & d1.l$challenge=="Before Challenge"],
                    d1.l$must[d1.l$condition=="Dog" & d1.l$challenge=="After Challenge"], var.equal = T)
d1.dog.d <- cohensD(d1.l$must[d1.l$condition=="Dog" & d1.l$challenge=="Before Challenge"],
                      d1.l$must[d1.l$condition=="Dog" & d1.l$challenge=="After Challenge"])
```

We first analyzed participants' agreement with the 'must not' claim using linear mixed-effects models with participants as a random factor and Condition (child vs. dog) and Challenge (before challenge vs. after challenge) as fixed factors that were allowed to interact. This analysis revealed a main effect of Condition, $\chi^2$(`r lm1.cond.df`)$=$ `r round(lm1.cond.chi, digits=3)`, $p <$ `r max(.001, round(lm1.cond.p,digits=3))`, a main effect of Challenge, $\chi^2$(`r lm1.chal.df`)$=$ `r round(lm1.chal.chi, digits=3)`, $p <$ `r max(.001, round(lm1.chal.p,digits=3))`, and, critically, the predicted interaction between Condition and Challenge, which was highly significant, $\chi^2$(`r lm1.inter.df`)$=$ `r round(lm1.inter.chi, digits=3)`, $p <$ `r max(.001, round(lm1.inter.p,digits=3))` (Figure 1).
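
The model comparisons above are likelihood-ratio tests between nested models. As a minimal sketch of that logic, the following uses simulated data and ordinary `lm()` fits rather than the `lmer()` models and data analyzed here; the names `sim`, `full`, and `reduced` are purely illustrative.

```r
# Illustrative only: simulated data, not the study data.
set.seed(1)
n <- 200
sim <- data.frame(
  condition = factor(rep(c("Child", "Dog"), each = n / 2)),
  challenge = factor(rep(c("Before", "After"), times = n / 2))
)
# Simulate a rating with two main effects and an interaction.
sim$must <- 50 +
  20 * (sim$condition == "Child") +
  10 * (sim$challenge == "Before") +
  15 * (sim$condition == "Child" & sim$challenge == "Before") +
  rnorm(n, sd = 10)

full    <- lm(must ~ condition * challenge, data = sim)
reduced <- lm(must ~ condition + challenge, data = sim)

# Likelihood-ratio statistic: twice the difference in log-likelihoods,
# referred to a chi-square distribution whose df is the number of
# parameters dropped from the full model.
lrt_chi <- as.numeric(2 * (logLik(full) - logLik(reduced)))
lrt_df  <- attr(logLik(full), "df") - attr(logLik(reduced), "df")
lrt_p   <- pchisq(lrt_chi, df = lrt_df, lower.tail = FALSE)
```

Dropping the interaction term removes one parameter, so `lrt_df` is 1 here, mirroring the single degree of freedom in the interaction test reported above.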

We investigated this interaction effect with planned comparisons that focused separately on the child and dog cases. Participants largely agreed with the claim that the child must not have any siblings before the challenge ($M =$ `r round(d1.sums$mean_must[1], digits=3)`; $SD =$ `r round(d1.sums$sd_must[1], digits=3)`), but this agreement was reduced drastically after the possibility of twins or triplets was raised explicitly ($M =$ `r round(d1.sums$mean_must[2], digits=3)`; $SD =$ `r round(d1.sums$sd_must[2], digits=3)`), $t$(`r round(d1.child.t$parameter, digits=2)`) $=$ `r round(d1.child.t$statistic,digits=3)`, $p <$ `r max(.001,round(d1.child.t$p.value, digits=3))`, $d =$ `r round(d1.child.d, digits=3)`. By contrast, participants did not strongly agree with the claim that the dog must not have any siblings, even before the challenge ($M =$ `r round(d1.sums$mean_must[3], digits=3)`; $SD =$ `r round(d1.sums$sd_must[3], digits=3)`), and explicitly raising the possibility of multiple births had only a small effect on their agreement ($M =$ `r round(d1.sums$mean_must[4], digits=3)`; $SD =$ `r round(d1.sums$sd_must[4], digits=3)`), $t$(`r round(d1.dog.t$parameter, digits=2)`) $=$ `r round(d1.dog.t$statistic,digits=3)`, $p =$ `r max(.001,round(d1.dog.t$p.value, digits=3))`, $d =$ `r round(d1.dog.d, digits=3)`.

```{r normality constraints fig1, fig.cap="**Figure 1**. Agreement with 'must not' claim as a function of both condition and challenge."}
                   
d1.fig1 <- d1.l %>%
             ggplot(aes(y=must, x=condition, fill=challenge)) +
                    geom_boxplot() +
                    geom_point(stat="identity", position=position_jitterdodge(jitter.width = .25), color="Black",alpha=.2)+
                    #scale_fill_manual(values=blackGreyPalette) + 
                    ylab("Agreement that child/dog must not have any siblings") +
                    #facet_grid(~kind, scales = "free_x") +
                    xlab("Who was the claim about?") +
                    coord_cartesian(ylim=c(0,100)) +
                    theme_bw() +
                    theme(
                      plot.background = element_blank()
                      ,panel.grid.major = element_blank()
                      ,panel.grid.minor = element_blank()
                      #,legend.position="null"
                      ,legend.title=element_blank()
                      ,legend.text=element_text(size=rel(1))
                      ,axis.text.x=element_text(size=rel(1))
                      ,axis.text.y=element_text(size=rel(1))
                      ,axis.title=element_text(size=rel(1))
                      ,axis.title.y = element_text(vjust = 0.75)
                      ,plot.title = element_text(face="bold",vjust=.75,size=rel(1.75))
                    )
print(d1.fig1)

d1.l$PossCheck <- factor(d1.l$PossCheck, levels=c("No","Yes","Other"))

d1.possCheck <- d1.l %>% filter(challenge=="Before Challenge") %>% ## NB: Conversion to long format doubled instances of PossCheck
                         group_by(condition,PossCheck) %>%
                         tally() %>%
                         spread(PossCheck,n)


d1.chisq <- chisq.test(d1.possCheck[,2:3])

```

We next investigated whether this difference arose from differences in participants' tendency to exclude the possibility of multiple births from a single pregnancy in the case of a child (where they are relatively abnormal), but not in the case of the dog (where this possibility is relatively normal). To do this, we first considered participants' responses to the question that asked them whether they had considered this possibility before we raised it, and found that this tendency shifted drastically between conditions, $\chi^2$(`r d1.chisq$parameter`) $=$ `r round(d1.chisq$statistic, digits=3)`, $p <$ `r max(.001, round(d1.chisq$p.value,digits=3))` (Table 1).
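
The test reported above is a chi-square test of independence on the Condition by Yes/No counts. A minimal sketch with made-up counts (the numbers below are hypothetical placeholders, not the counts in Table 1) shows the shape of the input and the fields that the inline code reads:

```r
# Hypothetical counts for illustration only; not the study's data.
counts <- matrix(c(40, 10,
                   15, 35),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(condition  = c("Child", "Dog"),
                                 considered = c("No", "Yes")))

# For a 2 x 2 table, chisq.test() applies Yates' continuity
# correction by default (correct = TRUE).
res <- chisq.test(counts)

res$parameter  # degrees of freedom: (rows - 1) * (cols - 1)
res$statistic  # chi-square statistic
res$p.value    # p-value
```

Only the "No" and "Yes" columns enter the test, which is why the analysis code selects columns 2 and 3 of the tallied table.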


```{r exp3 possAnalyses, message=FALSE}

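kable(d1.possCheck, caption = "**Table 1**. Number of participants in each condition who reported having considered the possibility of multiple births before it was raised.", col.names=c("Condition","No","Yes","Other"))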

d1.lpc <- d1.l %>% filter(PossCheck!="Other")

lm2.10 <- lmer(must ~ PossCheck * challenge + (1|ResponseId), data=d1.lpc, REML= F)

## interaction
lm2.11 <- lmer(must ~ PossCheck + challenge + (1|ResponseId), data=d1.lpc, REML= F)
lm2.1inter <- anova(lm2.10,lm2.11)

lm2.1inter.df <- lm2.1inter$Df[2]
lm2.1inter.chi <- lm2.1inter$Chisq[2]
lm2.1inter.p <- lm2.1inter$`Pr(>Chisq)`[2]



## main effect of PossCheck [Not reported]
lm2.12 <- lmer(must ~ challenge + (1|ResponseId), data=d1.lpc, REML= F)
lm2.1cond <- anova(lm2.11,lm2.12)


## main effect of challenge [Not reported]
lm2.13 <- lmer(must ~ PossCheck + (1|ResponseId), data=d1.lpc, REML= F)
lm2.1chal <- anova(lm2.11,lm2.13)


## mediation analysis:

## this is here to show how we ran the mediation, but commented out to make compiling faster and so you don't have problems where the mediation package creates conflicts with tidyverse

# library(mediation)
# 
# d1m <- d1.l %>% filter(PossCheck!="Other", challenge=="Before Challenge") %>% mutate(PossCheckn = (as.numeric(factor(PossCheck))*-1)+2)
# 
# med.fit <- glm(PossCheckn ~ condition, data = d1m, family = binomial("probit"))
# 
# out.fit <- lm(must ~ PossCheckn + condition, data = d1m)
# 
# med.out <- mediate(med.fit, out.fit, treat = "condition", mediator = "PossCheckn", control.value="Dog", treat.value = "Child", sims = 1000)
# 
# saveRDS(med.out,file="models/med_out.rda")
# 
# detach(package:mediation)

med.out <- readRDS("models/med_out.rda")

```

While these data demonstrate that our manipulation worked as intended, we have not yet shown that this difference is what actually explains the difference in participants' agreement with the 'must' claim. Two further analyses suggest that this is the case. First, collapsing across Condition (child vs. dog), one finds that whether or not participants considered the possibility of multiple births has the predicted effect on agreement with the 'must' claim, both before and after the possibility is explicitly raised (Fig. 2). Statistically, this effect can be captured by the interaction between whether or not the possibility had been explicitly raised and whether participants had considered it before it was raised, $\chi^2$(`r lm2.1inter.df`)$=$ `r round(lm2.1inter.chi, digits=3)`, $p <$ `r max(.001, round(lm2.1inter.p,digits=3))`.


```{r exp3 possFig, fig.cap="**Figure 2**. Agreement with 'must not' claim as a function of both challenge and whether the possibility was previously considered."}

d1.fig2 <- d1.l %>% filter(PossCheck!="Other") %>%
             ggplot(aes(y=must, x=PossCheck, fill=challenge)) +
                    geom_boxplot() +
                    geom_point(stat="identity", position=position_jitterdodge(jitter.width = .25), color="Black",alpha=.2)+
                    #scale_fill_manual(values=blackGreyPalette) + 
                    ylab("Agreement that child/dog must not have any siblings") +
                    #facet_grid(~kind, scales = "free_x") +
                    xlab("Did you consider the possibility of multiple births before we raised it?") +
                    coord_cartesian(ylim=c(0,100)) +
                    theme_bw() +
                    theme(
                      plot.background = element_blank()
                      ,panel.grid.major = element_blank()
                      ,panel.grid.minor = element_blank()
                      #,legend.position="null"
                      ,legend.title=element_blank()
                      ,legend.text=element_text(size=rel(1))
                      ,axis.text.x=element_text(size=rel(1))
                      ,axis.text.y=element_text(size=rel(1))
                      ,axis.title=element_text(size=rel(1))
                      ,axis.title.y = element_text(vjust = 0.75)
                      ,plot.title = element_text(face="bold",vjust=.75,size=rel(1.75))
                    )

print(d1.fig2)

```

Second, we asked whether the effect of Condition on participants' agreement with the 'must' claim before the possibility was raised was mediated by whether or not they had considered the possibility of multiple births. A bootstrap mediation analysis revealed that whether the possibility was considered fully mediated the effect of Condition on agreement with the 'must' claim: $95\%$ *CI* of proportion mediated [`r round(med.out$n0.ci[[1]],digits=2)`, `r round(med.out$n0.ci[[2]],digits=2)`], $p <$ `r max(.001, round(med.out$n.avg.p,digits=3))`.
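
The mediation analysis above was run with the `mediation` package using a probit model for the mediator. As a rough illustration of the underlying idea only, the sketch below bootstraps the proportion mediated for a simpler, fully linear case using the product-of-coefficients decomposition. The simulated data, the continuous mediator, and the helper name `prop_mediated` are all assumptions made for the example, not the procedure used for the reported result.

```r
# Illustrative only: simulated data and a linear mediator model.
set.seed(4)
n <- 300
x <- rbinom(n, 1, 0.5)             # treatment indicator (e.g., 1 = Child, 0 = Dog)
m <- 0.6 * x + rnorm(n)            # mediator (continuous here for simplicity)
y <- 0.8 * m + 0.1 * x + rnorm(n)  # outcome
dat <- data.frame(x, m, y)

# Proportion mediated = indirect effect / total effect, where the
# indirect effect is the product a * b and the total is a * b + direct.
prop_mediated <- function(d) {
  a      <- coef(lm(m ~ x, data = d))["x"]  # treatment -> mediator
  fit_y  <- lm(y ~ m + x, data = d)
  b      <- coef(fit_y)["m"]                # mediator -> outcome
  direct <- coef(fit_y)["x"]                # treatment -> outcome, given mediator
  unname(a * b / (a * b + direct))
}

# Percentile bootstrap: resample rows with replacement and recompute.
boot <- replicate(500, prop_mediated(dat[sample(n, replace = TRUE), ]))
ci   <- quantile(boot, c(0.025, 0.975))
```

In the linear case the total effect decomposes exactly into the indirect and direct parts, which is what makes the ratio interpretable as a proportion; with a binary mediator, as in the analysis reported here, a simulation-based approach is needed instead.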


## Modal Anchors

We empirically investigated whether ordinary truth-value judgments of sentences involving different modal terms differ in the predicted way in Lewis's (1997) sorcerer scenario. We additionally assessed participants' subjective reports of the relevant modal anchor when evaluating the truth of different modal terms.

### Methods

#### Participants

```{r modalAnchors data, echo=FALSE, message=FALSE, warning=FALSE}

d2.0 <- read.csv("data/modalAnchor_lewis.csv") # could vs. might
d2.1 <- read.csv("data/modalAnchor_lewis_fragile.csv") # fragile vs. could
d2.2 <- read.csv("data/modalAnchor_lewis_might.csv") # might
d2.3 <- read.csv("data/modalAnchor_lewis_vulnerable.csv") # fragile / could / vulnerable

subjIDs <- c("politics","religion","gender","age","race","race_5_TEXT","eng")

d2.0.subs <- d2.0 %>% select(all_of(subjIDs))
d2.1.subs <- d2.1 %>% select(all_of(subjIDs))
d2.2.subs <- d2.2 %>% select(all_of(subjIDs))
d2.3.subs <- d2.3 %>% select(all_of(subjIDs))

```

All participants were recruited through Amazon Mechanical Turk and the survey was administered through Qualtrics. In **Study 1**, we recruited `r length(d2.0.subs$eng)` participants (*M*~age~=`r round(mean(d2.0.subs$age, na.rm=T),digits=2)`, *SD*~age~=`r round(sd(d2.0.subs$age, na.rm=T),digits=2)`; `r table(d2.0.subs$gender)[[1]]` females); in **Study 2**, we recruited `r length(d2.1.subs$eng)` participants (*M*~age~=`r round(mean(d2.1.subs$age, na.rm=T),digits=2)`, *SD*~age~=`r round(sd(d2.1.subs$age, na.rm=T),digits=2)`; `r table(d2.1.subs$gender)[[1]]` females); in **Study 3**, we recruited `r length(d2.2.subs$eng)` participants (*M*~age~=`r round(mean(d2.2.subs$age, na.rm=T),digits=2)`, *SD*~age~=`r round(sd(d2.2.subs$age, na.rm=T),digits=2)`; `r table(d2.2.subs$gender)[[2]]` females); in **Study 4**, we recruited `r length(d2.3.subs$eng)` participants (*M*~age~=`r round(mean(d2.3.subs$age, na.rm=T),digits=2)`, *SD*~age~=`r round(sd(d2.3.subs$age, na.rm=T),digits=2)`; `r table(d2.3.subs$gender)[[1]]` females).
<!-- NB: for Study 3, the table(gender)[[2]] here is because 1 subj didn't respond to the gender q, so female shifted right-->

#### Procedure

All participants read the following scenario: 

> Imagine that there is a production line that produces a special kind of glass, each of which is very pretty but is also extremely fragile. Each glass is sure to break from even a slight bump. 

> A magical sorcerer takes a liking to a glass from this production line and decides to protect the glass. He doesn't do anything to change the glass itself;  he likes the glass exactly the way it is. He only watches and waits, resolved that if ever his glass is struck, then, quick as a flash, he will cast a spell that changes the glass, renders it no longer fragile, and thereby aborts the process of breaking.

After reading this scenario, participants rated their agreement with one of several possible sentences about the glass the sorcerer liked:

> The glass is fragile. [*Fragile*]

> The glass could break. [*Could*]

> The glass might break. [*Might*]

> The glass is vulnerable. [*Vulnerable*]

In all cases, participants indicated their agreement with the sentence on a scale from (0) "Completely disagree" to (100) "Completely agree."

After answering this question, participants were asked the following question about the judgment they just made:

> When you answered the last question, were you thinking only about the glass itself, or about the situation the glass was in and the sorcerer who wants to protect it?

They selected one of three options: (a) "Only about the glass", (b) "About the glass and the sorcerer", or (c) "Other" for which they could optionally write in an answer.

The four studies varied only in which set of questions participants were asked. In Study 1, participants evaluated *Could* and *Might*; in Study 2, participants evaluated *Fragile* and *Could*; in Study 3 participants evaluated *Might*; in Study 4, participants evaluated *Could*, *Fragile*, and *Vulnerable*. 


### Results

```{r graphs, echo=FALSE, warning=FALSE, message=FALSE}

d2.0.sum <- d2.0 %>% 
             select(9,18:19,22,24:25,32) %>%
             rename(time=time_Page.Submit, might=might_1,could=could_1) %>%
             #filter(time>10) %>%
             pivot_longer(cols=c(2,3), names_to = "question", values_to = "response", values_drop_na = T) %>%
             mutate(question = factor(question),
                    question = factor(c("could","might")[question])
                    )

d2.1.sum <- d2.1 %>% 
             select(c(9,18:19,22,24:25,32)) %>%
             rename(time=time_Page.Submit, fragile=fragile_1,could=could_1) %>%
             #filter(time>10) %>%
             pivot_longer(cols=c(2,3), names_to = "question", values_to = "response", values_drop_na = T) %>%
             mutate(question = factor(question),
                    question = factor(c("could","fragile")[question])
                    ) 

d2.2.sum <- d2.2 %>% 
             select(c(9,18,21,23:24,31)) %>%
             rename(time=time_Page.Submit, might=could_1) %>% ##forgot to rename the question in the survey...
             #filter(time>10) %>%
             pivot_longer(cols=c(2), names_to = "question", values_to = "response", values_drop_na = T) %>%
             mutate(question = factor(question),
                    question = factor(c("might")[question])
                    ) 

d2.3.sum <- d2.3 %>% 
             select(c(9,18:20,23,25:26,33)) %>%
             rename(time=time_Page.Submit, fragile=fragile_1,could=could_1,vulnerable=vulnerable_1) %>%
             #filter(time>10) %>%
             pivot_longer(cols=c(2:4), names_to = "question", values_to = "response", values_drop_na = T) %>%
             mutate(question = factor(question),
                    question = factor(c("could","fragile","vulnerable")[question])
                    )

d2 <- rbind(d2.0.sum,d2.1.sum,d2.2.sum,d2.3.sum )

d2.aov <- anova(lm(response ~ question,data=d2))
d2.eta <- etaSquared(lm(response ~ question,data=d2))


fragile_vulnerable <- t.test(d2$response[d2$question=="fragile"],
                             d2$response[d2$question=="vulnerable"])

fragile_vulnerable.d <- cohensD(d2$response[d2$question=="fragile"],
                                d2$response[d2$question=="vulnerable"])

vulernable_could <- t.test(d2$response[d2$question=="vulnerable"],
                           d2$response[d2$question=="could"])

vulernable_could.d <- cohensD(d2$response[d2$question=="vulnerable"],
                              d2$response[d2$question=="could"])

could_might <- t.test(d2$response[d2$question=="could"],
                      d2$response[d2$question=="might"])

could_might.d <- cohensD(d2$response[d2$question=="could"],
                        d2$response[d2$question=="might"])


## Stats reported in the main article:

fragile_could <- t.test(d2$response[d2$question=="fragile"],
                        d2$response[d2$question=="could"])

fragile_could.d <- cohensD(d2$response[d2$question=="fragile"],
                           d2$response[d2$question=="could"])

could_fragile.var <-var.test(d2$response[d2$question=="could"],
                             d2$response[d2$question=="fragile"])

```

Given the identical nature of the study procedures and overlapping conditions across the four studies, the data were combined and analyzed jointly. Overall, there was a clear effect of question on participants' agreement, $F$(`r d2.aov$Df[1]`, `r d2.aov$Df[2]`)$=$ `r round(d2.aov$'F value'[1],digits=2)`, $p<$ `r max(round(d2.aov$'Pr(>F)'[1],digits=3),.001)`, $\eta_p^2=$ `r round(d2.eta[2],digits=3)` (see Fig. 3). More specifically, participants agreed more with *Fragile* than with *Vulnerable*, $t$(`r round(fragile_vulnerable$parameter,digits=2)`) $=$ `r round(fragile_vulnerable$statistic,digits=2)`, $p=$ `r round(fragile_vulnerable$p.value,digits=3)`, $d=$ `r round(fragile_vulnerable.d,digits=3)`, and agreed more with *Vulnerable* than with *Could*, $t$(`r round(vulernable_could$parameter,digits=2)`) $=$ `r round(vulernable_could$statistic,digits=2)`, $p=$ `r round(vulernable_could$p.value,digits=3)`, $d=$ `r round(vulernable_could.d,digits=3)`. The difference in agreement between *Could* and *Might* was not statistically significant ($\alpha=.05$), $t$(`r round(could_might$parameter,digits=2)`) $=$ `r round(could_might$statistic,digits=2)`, $p=$ `r round(could_might$p.value,digits=3)`, $d=$ `r round(could_might.d,digits=3)`.
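
Each pairwise comparison above pairs a Welch *t*-test with a Cohen's *d*. As a sketch of how the effect size relates to the raw ratings, the following computes *d* by hand with a pooled standard deviation on simulated data. The helper `pooled_d` is a hypothetical name introduced for illustration, and we are assuming the pooled-SD form of *d*; it is not a function from `lsr`.

```r
# Cohen's d with a pooled standard deviation (illustrative helper).
pooled_d <- function(x, y) {
  x  <- x[!is.na(x)]
  y  <- y[!is.na(y)]
  nx <- length(x)
  ny <- length(y)
  # Pooled variance weights each group's variance by its df.
  pooled_var <- ((nx - 1) * var(x) + (ny - 1) * var(y)) / (nx + ny - 2)
  abs(mean(x) - mean(y)) / sqrt(pooled_var)
}

# Simulated ratings, for illustration only.
set.seed(2)
a <- rnorm(60, mean = 80, sd = 15)
b <- rnorm(60, mean = 65, sd = 15)

welch <- t.test(a, b)  # Welch test: var.equal = FALSE is the default
d     <- pooled_d(a, b)
```

Because the Welch test does not assume equal variances, its degrees of freedom (`welch$parameter`) are generally non-integer, which is why the reported *t* statistics carry fractional degrees of freedom.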

As reported in the main article, we additionally found that there was more variation in participants' agreement ratings with the *Could* claim than with the *Fragile* claim, $F$(`r could_fragile.var$parameter[1]`, `r could_fragile.var$parameter[2]`) $=$ `r round(could_fragile.var$statistic,digits=3)`, $p<$ `r max(.001,round(could_fragile.var$p.value,digits=3))`.
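
The comparison of variances above is an *F*-test on the ratio of the two sample variances. The sketch below reproduces what `var.test()` reports from first principles, using simulated ratings rather than the study data; the variable names are illustrative.

```r
# Simulated ratings, for illustration only.
set.seed(3)
x <- rnorm(80, mean = 50, sd = 20)  # more variable group
y <- rnorm(70, mean = 85, sd = 15)  # less variable group

# F statistic: ratio of sample variances, with n - 1 df for each group.
f_stat <- var(x) / var(y)
df1    <- length(x) - 1
df2    <- length(y) - 1

# Two-sided p-value: twice the smaller tail probability.
p_two <- 2 * min(pf(f_stat, df1, df2),
                 pf(f_stat, df1, df2, lower.tail = FALSE))

vt <- var.test(x, y)  # vt$statistic, vt$parameter, vt$p.value
```

The numerator and denominator degrees of freedom in `vt$parameter` correspond to the two values reported in parentheses after $F$.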

```{r Figure3, echo=FALSE,fig.cap="**Figure 3.** Boxplots of participants\' agreement ratings with each of the four sentences about the Sorcerer\'s glass.", fig.width=5}

d2.fig <- d2 %>% 
             mutate(question = factor(question,
                                      levels=c("might","could","vulnerable","fragile"))) %>%
             #mutate(question = factor(c("Might","Could","Vulnerable","Fragile")[question])) %>%
                     #filter(question %in% c("Could","Fragile")) %>% 
                     #mutate(question = factor(question, levels=c("Fragile","Could"))) %>%
             ggplot(aes(x=question, y=response, fill=question)) +
                    geom_boxplot() +
                    geom_point(stat="identity", 
                               position=position_jitterdodge(jitter.width = .5), 
                               color="Black",alpha=.25) +
                    #scale_fill_manual(values=blackGreyPalette) + 
                    ylab("Agreement rating") +
                    xlab("") +
                    coord_cartesian(ylim=c(0,100)) +
                    theme_bw() +
                    theme(
                      plot.background = element_blank()
                      ,panel.grid.major = element_blank()
                      ,panel.grid.minor = element_blank()
                      ,legend.position="none"
                      ,legend.title=element_blank()
                      ,legend.text=element_text(size=rel(1.5))
                      ,axis.text.x=element_text(size=rel(1.5))
                      ,axis.text.y=element_text(size=rel(1.5))
                      ,axis.title=element_text(size=rel(1.5))
                      ,axis.title.y = element_text(vjust = 0.75)
                      ,plot.title = element_text(face="bold",vjust=.75,size=rel(1.75))
                    )



plot(d2.fig)


```

```{r anchorsAnalysis, echo=FALSE,  message=FALSE}

d2.anchor <- d2 %>% filter(anchorTest!="Other") %>% 
  mutate(anchorTest = as.numeric(factor(anchorTest)),
        question = factor(question, levels=c("fragile","vulnerable","could","might"))
         )

d2a.coef <- coef(summary(glm(anchorTest ~ question,data=d2.anchor)))

d2.sum <- d2 %>% mutate(anchor = case_when(anchorTest == "About the glass and the sorcerer" ~ 1,
                                           anchorTest == "Only about the glass" ~ 0)
                        )%>%
                     group_by(question) %>%
                     summarize(N  = length(response),
                             mean = mean(response, na.rm=TRUE),
                             sd = sd(response, na.rm=TRUE),
                             se   = sd/ sqrt(N),
                             anchor = mean(anchor,na.rm=T)
                      ) %>%
                    mutate(
                      question = factor(question,
                                        levels=c("might","could","vulnerable","fragile"))
                    )


## stat reported in the main article:
d2.counts <- d2 %>% filter(question %in% c("could","fragile")) %>% 
                  group_by(question,anchorTest) %>%
                  summarize(count = n()) %>%
                  filter(anchorTest!="Other") %>%
                  pivot_wider(names_from = anchorTest,values_from = count) %>%
                  unclass() %>%
                  as.data.frame()

c.counts.chi <- chisq.test(d2.counts[,2:3])


```

Participants' reports of whether they were thinking of the sorcerer when rating their agreement with each of the four sentences showed the inverse pattern (Fig. 4). Specifically, when rating their agreement with *Fragile*, most participants did not think about the sorcerer (`r round(100*d2.sum$anchor[d2.sum$question=="fragile"],digits=1)`% reported doing so). Moreover, compared to *Fragile*, participants were significantly more likely to think about the sorcerer when rating their agreement with *Vulnerable* (`r round(100*d2.sum$anchor[d2.sum$question=="vulnerable"],digits=1)`%), $p =$ `r round(d2a.coef[2,4],digits=3)`; participants were even more likely to think about the sorcerer when rating their agreement with *Could* (`r round(100*d2.sum$anchor[d2.sum$question=="could"],digits=1)`%), $p <$ `r max(.001,round(d2a.coef[3,4],digits=3))`, and *Might* (`r round(100*d2.sum$anchor[d2.sum$question=="might"],digits=1)`%), $p <$ `r max(.001,round(d2a.coef[4,4],digits=3))`.

```{r Figure4, fig.cap="**Figure 4.** Probability that participants were thinking about the sorcerer when rating their agreement with each of the four sentences about the glass.", fig.width=5}

d2.fig2 <- d2.sum %>%  
          # mutate(question = factor(c("Might","Could","Vulnerable","Fragile")[question])) %>%
          #            filter(question %in% c("Could","Fragile")) %>%
          #            mutate(question = factor(question, levels=c("Fragile","Could"))) %>%
                ggplot(aes(y=anchor, x=question, fill=question)) +
                    geom_bar(position="dodge", stat="identity")  +
                    ylab("P(sorcerer in modal anchor)") + # \U2284
                    xlab("") +
                    coord_cartesian(ylim=c(0,1)) +
                    theme_bw() +
                    theme(
                      plot.background = element_blank()
                      ,panel.grid.major = element_blank()
                      ,panel.grid.minor = element_blank()
                      ,legend.position="none"
                      ,legend.title=element_blank()
                      ,legend.text=element_text(size=rel(1.5))
                      ,axis.text.x=element_text(size=rel(1.5))
                      ,axis.text.y=element_text(size=rel(1.5))
                      ,axis.title=element_text(size=rel(1.5))
                      ,axis.title.y = element_text(vjust = 0.75)
                      ,plot.title = element_text(face="bold",vjust=.75,size=rel(1.75))
                    )

plot(d2.fig2)

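kable(d2.sum, caption = "**Table 2**. Agreement ratings and proportion of participants who reported thinking about the sorcerer, for each modal term.", col.names=c("Modal Term","N","Mean Agreement", "SD", "SE", "P(sorcerer)"))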


```


```{r anchor, echo=FALSE, message=FALSE, fig.width=6}

d2.sum2 <- d2 %>% filter(anchorTest != "Other") %>% 
                mutate(
                      question = factor(question,
                                        levels=c("might","could","vulnerable","fragile"))
                    ) %>%
                group_by(anchorTest,question) %>%
                     summarize(N  = length(response)
                             ,mean = mean(response, na.rm=TRUE)
                             ,sd = sd(response, na.rm=TRUE)
                             ,se   = sd/ sqrt(N)
                             #,anchor = mean(anchor,na.rm=T)
                      )

lm3 <- anova(lm(response~anchorTest * question, data=d2.anchor))
lm3.1 <- anova(lm(response~question,data=d2.anchor[d2.anchor$anchorTest==1,]))
lm3.2 <- anova(lm(response~question,data=d2.anchor[d2.anchor$anchorTest==2,]))

fragile.anchor <- t.test(d2$response[d2$anchorTest=="About the glass and the sorcerer" & d2$question=="fragile"]
                         ,d2$response[d2$anchorTest=="Only about the glass" & d2$question=="fragile"])


```

We next demonstrated that whether or not participants reported that they were thinking of the sorcerer was highly predictive of their agreement rating overall, $F$(`r lm3$Df[1]`,`r lm3$Df[4]`)=`r round(lm3$'F value'[1],digits=2)`, $p=$ `r max(.001,round(lm3$'Pr(>F)'[1],digits=3))`. Further, we can ask whether considering the sorcerer when evaluating the sentence differentially affected different modal judgments. We found that it did (Figure 5). Statistically, this pattern can be captured by an interaction between the different modal questions and whether or not the anchor included the sorcerer, $F$(`r lm3$Df[3]`,`r lm3$Df[4]`)=`r round(lm3$'F value'[3],digits=2)`, $p=$ `r max(.001,round(lm3$'Pr(>F)'[3],digits=3))`. More specifically, we found that when the modal anchor included both the glass and the sorcerer, answers to the different questions differed substantially, $F$(`r lm3.1$Df[1]`,`r lm3.1$Df[2]`)=`r round(lm3.1$'F value'[1],digits=2)`, $p<$ `r max(.001,round(lm3.1$'Pr(>F)'[1],digits=3))`, while they were much more similar to one another when the modal anchor included only the glass and not the sorcerer, $F$(`r lm3.2$Df[1]`,`r lm3.2$Df[2]`)=`r round(lm3.2$'F value'[1],digits=2)`, $p=$ `r max(.001,round(lm3.2$'Pr(>F)'[1],digits=3))`.

```{r Figure 5, fig.cap="**Figure 5.** Boxplots of participants\' agreement ratings with each of the four sentences about the Sorcerer\'s glass split by whether the modal anchor included the sorcerer.", echo=FALSE}

d2.fig3 <- d2 %>% mutate(question = factor(question),
                       question = factor(c("Could","Might","Fragile","Vulnerable")[question]),
                       question = factor(question, levels=c("Might","Could","Vulnerable","Fragile"))) %>%
             #filter(question %in% c("Could", "Fragile")) %>%
                     mutate(anchorTest = factor(anchorTest),
                            anchorTest = factor(c("Sorcerer too","Glass only","Other")[anchorTest])) %>%
             filter(anchorTest!="Other") %>%
             ggplot(aes(x=anchorTest, y=response, fill=question)) +
                    geom_boxplot() +
                    facet_grid(~question) +
                    geom_point(stat="identity", 
                               position=position_jitterdodge(jitter.width = .5), 
                               color="Black",alpha=.25) +
                    #scale_fill_manual(values=blackGreyPalette) + 
                    ylab("Agreement rating") +
                    xlab("") +
                    coord_cartesian(ylim=c(0,100)) +
                    theme_bw() +
                   theme(
                      plot.background = element_blank()
                      ,panel.grid.major = element_blank()
                      ,panel.grid.minor = element_blank()
                      ,legend.position="none"
                      ,legend.title=element_blank()
                      ,legend.text=element_text(size=rel(1.5))
                      ,axis.text.x=element_text(size=rel(1.5),angle = 45, vjust = 1, hjust=1)
                      ,axis.text.y=element_text(size=rel(1.5))
                      ,axis.title=element_text(size=rel(1.5))
                      ,axis.title.y = element_text(vjust = 0.75)
                      ,strip.text.x = element_text(size=rel(1.5))
                      ,plot.title = element_text(face="bold",vjust=.75,size=rel(1.75))
                    )
print(d2.fig3)


d2m <- d2 %>% filter(anchorTest!="Other" & question %in% c("fragile","could")) %>% mutate(anchorTestn = (as.numeric(factor(anchorTest))*-1)+2)

d2.anchor <- aov(lm(response ~ anchorTestn + question, data = d2m))
#summary(d2.anchor)
d2.anhor.eta <- etaSquared(d2.anchor)

## mediation analysis asking whether the effect of differences in agreement with could/might could be explained by differences in anchors:

## below we show how we ran the mediation, but have commented it out to make compiling faster and so you don't have problems where the mediation package creates conflicts with tidyverse

# library(mediation)
# 
# d2m <- d2 %>% filter(anchorTest!="Other" & question %in% c("fragile","could")) %>% mutate(anchorTestn = (as.numeric(factor(anchorTest))*-1)+2)
# 
# med.fit <- glm(anchorTestn ~ question, data = d2m, family = binomial("probit"))
# 
# out.fit <- lm(response ~ anchorTestn + question, data = d2m)
# 
# med2.out <- mediate(med.fit, out.fit, treat = "question", mediator = "anchorTestn", control.value="fragile", treat.value = "could", sims = 10000)
# 
# #saveRDS(med2.out,file="models/med2_out.rda")
# 
# detach(package:mediation)

med2.out <- readRDS("models/med2_out.rda")
```

As reported in the main paper, we also asked whether the difference in agreement with *Fragile* vs. *Could* can be explained by whether or not the sorcerer was included in the modal anchor. A bootstrap mediation analysis revealed that whether the sorcerer was considered largely mediated this difference: $95\%$ *CI* of proportion mediated [`r round(med2.out$n0.ci[[1]],digits=2)`, `r round(med2.out$n0.ci[[2]],digits=2)`], $p <$ `r max(.001, round(med2.out$n.avg.p,digits=3))`.
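
To make the quantity "proportion mediated" concrete, here is a minimal sketch on simulated data. It is a simplified stand-in and not the procedure run by `mediation::mediate()`: it uses linear models for both the mediator and the outcome (the analysis above used a probit mediator model) and a plain percentile bootstrap. The data frame `sim`, the 0/1 coding of `question` and `anchor`, the effect sizes, and the helper `prop_mediated` are all assumptions for illustration.

```r
# Simulated illustration only; a simplified stand-in for mediation::mediate().
set.seed(2)
n <- 400
question <- rep(c(0, 1), each = n / 2)  # assumed coding: 0 = "fragile", 1 = "could"

# Mediator: 1 if the sorcerer was part of the anchor (more likely for "could")
anchor <- rbinom(n, size = 1, prob = ifelse(question == 1, 0.7, 0.2))

# Outcome: agreement drops mainly when the sorcerer is considered
response <- 85 - 40 * anchor - 5 * question + rnorm(n, sd = 10)
sim <- data.frame(question, anchor, response)

prop_mediated <- function(dat) {
  # Path a: effect of the question on the mediator
  a <- coef(lm(anchor ~ question, data = dat))["question"]
  # Path b (mediator -> outcome) and direct effect c' from the outcome model
  out <- coef(lm(response ~ anchor + question, data = dat))
  indirect <- a * out["anchor"]
  direct <- out["question"]
  unname(indirect / (indirect + direct))
}

estimate <- prop_mediated(sim)

# Percentile bootstrap: resample participants with replacement
boot <- replicate(1000, prop_mediated(sim[sample(nrow(sim), replace = TRUE), ]))
ci <- quantile(boot, probs = c(.025, .975))
```

With the assumed effect sizes, the indirect path accounts for most of the total effect, so `estimate` is close to the simulated value of 0.8 and `ci` gives a percentile interval around it.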

### Supplemental discussion

Modal judgments like whether or not the glass 'might' or 'could' break take anchors that can vary dramatically in their contents and thus change the set of possibilities projected. Critically, they can take large situations that may include the sorcerer protecting the glass, and thus should be sensitive to whether the participant was considering the sorcerer when evaluating the sentence. As predicted, we find (i) large differences in agreement ratings with sentences involving these terms, (ii) correspondingly large differences in whether the modal anchor included the sorcerer, and (iii) that (i) is predicted by (ii).

In comparison, so-called "stage-level" predicates like 'vulnerable' will tend to exhibit somewhat less flexibility in their anchor argument, while still allowing for some variation because the anchor includes a spatiotemporal component that is not explicitly delimited. This aspect of 'vulnerable' suggests that we should find comparatively more agreement with the sentences involving 'vulnerable' and a correspondingly weaker tendency for the evaluations of 'vulnerability' to be affected by whether participants were considering the sorcerer. This is the pattern we observe.

Finally, for "individual-level" predicates like 'fragile', the modal anchor is provided by the syntactic subject and thus truth-value judgments of sentences involving such predicates should exhibit the least amount of variation, and correspondingly the highest level of agreement. We find the highest level of agreement in participants' evaluations of fragility and find that these judgments are the least affected by whether or not participants were considering the sorcerer when responding.

## Lexically specified domain restriction

Here we empirically investigated the difference in the lexically specified domain restrictions encoded by two modal auxiliaries, *Could* and *Might*.

### Methods

#### Participants

```{r lexicalRestrictions data, echo=FALSE}
d3 <- read.csv("data/lexicallySpecifiedDomainRestriction.csv")

```

We collected a sample of `r length(unique(d3$ResponseId))` participants ($M_{age}$ = `r round(mean(d3$age,na.rm=T),digits=2)`; $SD_{age}$ = `r round(sd(d3$age,na.rm=T),digits=2)`; `r table(d3$gender)[[2]]` females) from Prolific ([www.prolific.com]( www.prolific.com)).

#### Procedure

All participants read the following background context:

> FitzRoy is a notoriously ruthless pirate who decided to pose as a captain of a ship in order to steal several expensive sculptures that a museum needs to transport across the sea. After he posed as an ordinary ship captain who was taking some passengers across the sea, FitzRoy got the job.

> While sailing on the sea, a large storm came upon FitzRoy and his small ship. As the waves began to grow larger, FitzRoy realized that his small vessel was too heavy and the ship would flood if he didn’t make it lighter. The only things on FitzRoy's small boat were the expensive art sculptures that he was stealing and the passengers he was transporting. He knew he had to throw something overboard to keep the ship from capsizing.

After reading, participants rated their agreement with the following four statements on a 100-point scale from 0 ('Completely Disagree') to 100 ('Completely Agree').

> FitzRoy could throw the sculptures overboard.

> FitzRoy might throw the sculptures overboard.

> FitzRoy could throw the passengers overboard.

> FitzRoy might throw the passengers overboard.

Participants then completed two comprehension questions. The first asked what FitzRoy was notorious for, and participants selected one of the following three options:

* being generous
* being ruthless
* his name

The second asked what was *not* described as being on the ship, and participants selected one of the following three options:

* Passengers
* Sculptures
* FitzRoy's wife

Lastly, participants completed a brief demographic questionnaire.

### Results

```{r lexcialRestrictions analysis, echo=FALSE,warning=FALSE,message=FALSE}

d3 <- d3 %>% select(c(9,18:22,24,33)) %>% ## takes only the participant id, the answers to the modal and control questions, and the random order info
              filter(comp1=="being ruthless" & comp2=="FitzRoy's wife") %>% ## this excludes the participants who didn't pass both control questions
              select(-c("comp1","comp2")) %>% ## removes control questions
              pivot_longer(c(2:5),names_to = "question",values_to = "response") %>% 
              mutate(question = substr(question,1,9),
                     question=factor(question),
                     modal = factor(c("Could","Could","Might","Might")[question]),
                     prejacent = factor(c("Passengers","Sculptures","Passengers","Sculptures")[question]),
                     prejacent = factor(prejacent, levels=c("Sculptures","Passengers")),
                     firstQ = substr(Scenario_DO,12,20),
                     firstResponse = case_when(
                      question==firstQ ~ T,
                                     T ~ F
                     ),
                     secondQ = substr(Scenario_DO,22,32),
                     secondQ = gsub("pt","",secondQ, fixed = TRUE),
                     secondQ = gsub("t|","",secondQ, fixed = TRUE),
                     secondResponse = case_when(
                      question==secondQ ~ T,
                                      T ~ F
                     )
                     ) %>%
             select(-c("Scenario_DO","firstQ","secondQ","question"))

d3.f <- d3 %>%  filter(firstResponse)

lm4.aov <- anova(lm(response ~ modal * prejacent, data=d3.f))
lm4.eta <- etaSquared(lm(response ~ modal * prejacent, data=d3.f))

d3f.sum <- d3.f %>% group_by(modal,prejacent) %>% 
                 filter(firstResponse) %>%
                 summarize(mean = mean(response),
                             sd = sd(response),
                              n = length(response),
                             se = sd/sqrt(n))

# Sculptures: Could vs. Might
could_might_sculpt.t <- t.test(d3.f$response[d3.f$modal=="Could" & d3.f$prejacent=="Sculptures"],
                                d3.f$response[d3.f$modal=="Might" & d3.f$prejacent=="Sculptures"])

could_might_sculpt.d <- cohensD(d3.f$response[d3.f$modal=="Could" & d3.f$prejacent=="Sculptures"],
                                d3.f$response[d3.f$modal=="Might" & d3.f$prejacent=="Sculptures"])

#Passengers: could vs. might
could_might_pass.t <-  t.test(d3.f$response[d3.f$modal=="Could" & d3.f$prejacent=="Passengers"],
                                d3.f$response[d3.f$modal=="Might" & d3.f$prejacent=="Passengers"])

could_might_pass.d <- cohensD(d3.f$response[d3.f$modal=="Could" & d3.f$prejacent=="Passengers"],
                        d3.f$response[d3.f$modal=="Might" & d3.f$prejacent=="Passengers"])


## Other tests:
# t.test(d3.f$response[d3.f$modal=="Could" & d3.f$prejacent=="Sculptures"],
#        d3.f$response[d3.f$modal=="Could" & d3.f$prejacent=="Passengers"])
# 
# cohensD(d3.f$response[d3.f$modal=="Could" & d3.f$prejacent=="Sculptures"],
#        d3.f$response[d3.f$modal=="Could" & d3.f$prejacent=="Passengers"])
# 
# t.test(d3.f$response[d3.f$modal=="Might" & d3.f$prejacent=="Sculptures"],
#        d3.f$response[d3.f$modal=="Might" & d3.f$prejacent=="Passengers"])
# 
# cohensD(d3.f$response[d3.f$modal=="Might" & d3.f$prejacent=="Sculptures"],
#        d3.f$response[d3.f$modal=="Might" & d3.f$prejacent=="Passengers"])

```

We excluded all participants who did not answer both comprehension check questions correctly (11 participants). For simplicity we focus on participants' first responses. Overall, we found the predicted interaction between the modal term used (*could* vs. *might*) and whether the modal claim concerned throwing passengers or sculptures overboard, $F$(`r lm4.aov$Df[3]`,`r lm4.aov$Df[4]`)=`r round(lm4.aov$'F value'[3],digits=2)`, $p=$ `r max(.001,round(lm4.aov$'Pr(>F)'[3],digits=3))`, $\eta_p^2=$ `r round(lm4.eta[6],digits=3)` (see Figure 6). Specifically, participants agreed more that FitzRoy *could* throw the sculptures overboard ($M =$ `r round(d3f.sum$mean[1], digits=3)`; $SD =$ `r round(d3f.sum$sd[1], digits=3)`) than that he *might* throw the sculptures overboard ($M =$ `r round(d3f.sum$mean[3], digits=3)`; $SD =$ `r round(d3f.sum$sd[3], digits=3)`), $t$(`r round(could_might_sculpt.t$parameter, digits=2)`) $=$ `r round(could_might_sculpt.t$statistic,digits=3)`, $p <$ `r max(.001,round(could_might_sculpt.t$p.value, digits=3))`, $d =$ `r round(could_might_sculpt.d, digits=3)`. By contrast, they agreed less that FitzRoy *could* throw the passengers overboard ($M =$ `r round(d3f.sum$mean[2], digits=3)`; $SD =$ `r round(d3f.sum$sd[2], digits=3)`) than that he *might* throw the passengers overboard ($M =$ `r round(d3f.sum$mean[4], digits=3)`; $SD =$ `r round(d3f.sum$sd[4], digits=3)`), $t$(`r round(could_might_pass.t$parameter, digits=2)`) $=$ `r round(could_might_pass.t$statistic,digits=3)`, $p =$ `r max(.001,round(could_might_pass.t$p.value, digits=3))`, $d =$ `r round(could_might_pass.d, digits=3)`.
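
The statistics reported in this paragraph can be illustrated with a minimal base-R sketch on simulated data. The vectors `could` and `might`, their means and standard deviations, and the helper `cohens_d_pooled` are assumptions for illustration. The helper uses a pooled standard deviation and returns the magnitude of the difference, which we take to correspond to the default behaviour of `lsr::cohensD`; that correspondence is an assumption rather than something verified here.

```r
# Simulated illustration only; none of these values come from the study.
set.seed(3)
could <- pmin(100, pmax(0, rnorm(40, mean = 85, sd = 12)))
might <- pmin(100, pmax(0, rnorm(40, mean = 60, sd = 20)))

# Welch's t-test (the t.test() default): unequal variances, fractional df,
# which is why the degrees of freedom are rounded when reported
welch <- t.test(could, might)

# Cohen's d with a pooled standard deviation
cohens_d_pooled <- function(x, y) {
  nx <- length(x)
  ny <- length(y)
  pooled_sd <- sqrt(((nx - 1) * var(x) + (ny - 1) * var(y)) / (nx + ny - 2))
  abs(mean(x) - mean(y)) / pooled_sd
}
d <- cohens_d_pooled(could, might)

# Partial eta squared for an effect: SS_effect / (SS_effect + SS_residual)
dat <- data.frame(response = c(could, might),
                  modal = factor(rep(c("Could", "Might"), each = 40)))
tab <- anova(lm(response ~ modal, data = dat))
eta_p2 <- tab$`Sum Sq`[1] / (tab$`Sum Sq`[1] + tab$`Sum Sq`[2])
```

Because the two simulated groups have equal size, the Welch statistic here equals `d * sqrt(20)` in magnitude, which is a convenient sanity check when reading $t$ and $d$ side by side.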

```{r Figure 6, echo=FALSE, fig.cap="**Figure 6.** Boxplots of participants’ agreement ratings with the existential modal claims when they concerned FitzRoy throwing the sculptures overboard (left boxes and points) or the passengers overboard (right boxes and points), both when the existential modal used was could (red boxes) and when it was might (blue boxes). The colored boxes depict the middle two quartiles of responses, the horizontal line depicts the median response, and the small grey dots depict individual participant responses. Thick black dots depict mean agreement and error bars represent +/- 1 SEM."}

d3f.fig <- d3.f %>%  filter(firstResponse) %>%
             ggplot(aes(x=prejacent, y=response, fill=modal)) +
                    geom_boxplot() +
                    #facet_grid(~question) +
                    geom_point(stat="identity", 
                               position=position_jitterdodge(jitter.width = .4), 
                               color="Black",alpha=.25) +
                    geom_point(data=d3f.sum,aes(y=mean,x=prejacent),
                               stat="identity",
                               position=position_dodge(width=.75),
                               color="black",
                               size=2) +
                    geom_errorbar(data=d3f.sum, aes(y=mean,ymin=mean-se,ymax=mean+se,x=prejacent),
                               width=.25,
                               position=position_dodge(width=.75),
                               color="black",
                               size=1) +
                    #scale_fill_manual(values=blackGreyPalette) + 
                    ylab("Agreement Rating") +
                    xlab("") +
                    coord_cartesian(ylim=c(0,100)) +
                    theme_bw() +
                   theme(
                      plot.background = element_blank()
                      ,panel.grid.major = element_blank()
                      ,panel.grid.minor = element_blank()
                      #,legend.position="null"
                      ,legend.title=element_blank()
                      ,legend.text=element_text(size=rel(1.5))
                      ,axis.text.x=element_text(size=rel(1.5))
                      ,axis.text.y=element_text(size=rel(1.5))
                      ,axis.title=element_text(size=rel(1.5))
                      ,axis.title.y = element_text(vjust = 0.75)
                      ,strip.text.x = element_text(size=rel(1.5))
                      ,plot.title = element_text(face="bold",vjust=.75,size=rel(1.75))
                    )

print(d3f.fig)


## Not included, but possibly of interest to readers who want to know whether the predicted domain-expansion order effects hold (they basically do):

# d3.fs <- d3 %>%  filter(firstResponse|secondResponse) %>%
#                   mutate(responseOrder = case_when(
#                     firstResponse ~ "First",
#                     secondResponse ~ "Second"
#                   )) %>%
#                   select(-c(firstResponse,secondResponse))
# 
# d3fs.sum <- d3.fs %>% group_by(modal,prejacent,responseOrder) %>% 
#                  summarize(mean = mean(response),
#                              sd = sd(response),
#                               n = length(response),
#                              se = sd/sqrt(n))
# 
# 
# d3.fig2 <- d3.fs %>%
#              ggplot(aes(x=modal, y=response, fill=responseOrder)) +
#                     geom_boxplot() +
#                     facet_grid(~prejacent) +
#                     geom_point(stat="identity", 
#                                position=position_jitterdodge(jitter.width = .4), 
#                                color="Black",alpha=.25) +
#                     geom_point(data=d3fs.sum,aes(y=mean,x=modal),
#                                stat="identity",
#                                position=position_dodge(width=.75),
#                                color="black",
#                                size=2) +
#                     geom_errorbar(data=d3fs.sum, aes(y=mean,ymin=mean-se,ymax=mean+se,x=modal),
#                                width=.25,
#                                position=position_dodge(width=.75),
#                                color="black",
#                                size=1) +
#                     #scale_fill_manual(values=blackGreyPalette) + 
#                     ylab("Agreement Rating") +
#                     xlab("") +
#                     coord_cartesian(ylim=c(0,100)) +
#                     theme_bw() +
#                    theme(
#                       plot.background = element_blank()
#                       ,panel.grid.major = element_blank()
#                       ,panel.grid.minor = element_blank()
#                       #,legend.position="null"
#                       ,legend.title=element_blank()
#                       ,legend.text=element_text(size=rel(1.5))
#                       ,axis.text.x=element_text(size=rel(1.5))
#                       ,axis.text.y=element_text(size=rel(1.5))
#                       ,axis.title=element_text(size=rel(1.5))
#                       ,axis.title.y = element_text(vjust = 0.75)
#                       ,strip.text.x = element_text(size=rel(1.5))
#                       ,plot.title = element_text(face="bold",vjust=.75,size=rel(1.75))
#                     )

```




